Georgia Institute of Technology - Timeliner

VAST 2009 Challenge
Challenge 1: Badge and Network Traffic

Authors and Affiliations:

Jaegul Choo, Georgia Institute of Technology, joyfull@cc.gatech.edu  [PRIMARY contact]
Pedro A. R. Walteros, Georgia Institute of Technology, pedro.andres.rangel@gmail.com
Wenjing Li, Georgia Institute of Technology, wli35@gatech.com

Tool(s):

Timeliner visualizes various types of data, e.g., Prox Card and IP Traffic data, along a timeline. It was developed specifically for Challenge 1 by Dr. Carsten Görg at Georgia Tech. It has two views, Per-employee and Per-day. The Per-employee view (e.g., Figures 1 and 2) visualizes a particular employee’s data, with hours on the horizontal axis and dates on the vertical axis. The Per-day view (e.g., Figures 3-5) visualizes the data for a particular day, with hours on the horizontal axis and employee ID/source IP on the vertical axis. In both views, data can be filtered by:

•  Either employee ID or date

•  Port number, e.g., email (25), http (80), proxy (8080)

•  Minimum request and response sizes

•  Destination IP

The legend includes:

•  Blue ‘x’: prox-in-building

•  Short vertical line: a network data instance. Green, yellow, and orange correspond to ports 25, 80, and 8080, respectively.

•  Red horizontal line: presence in the classified area, obtained by matching prox-in-classified with prox-out-classified

•  Red ‘x’: an unmatched prox-in-classified or prox-out-classified, i.e., if two prox-in-classified or two prox-out-classified events occur in a row, the second one is marked with a red ‘x’.
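The matching rule behind the red lines and red ‘x’ marks can be sketched as follows. This is a minimal illustration, not Timeliner's actual code: the function name and the (timestamp, kind) tuple layout are assumptions, and an entry that is still open at the end of the data is simply left unreported, since the description above does not say how that case is drawn.

```python
from datetime import datetime
from typing import List, Tuple

Event = Tuple[datetime, str]  # (timestamp, kind); layout is an assumption


def pair_classified_events(events: List[Event]):
    """Pair prox-in-classified with prox-out-classified for one employee.

    `events` must be sorted by time. Returns (intervals, unmatched):
    intervals are (enter, leave) pairs (the red horizontal lines) and
    unmatched are the events that would be drawn as a red 'x'.
    """
    intervals = []
    unmatched = []
    open_in = None  # time of the currently unmatched prox-in-classified
    for ts, kind in events:
        if kind == "prox-in-classified":
            if open_in is not None:
                # Two entries in a row: the second one is unmatched.
                unmatched.append((ts, kind))
            else:
                open_in = ts
        elif kind == "prox-out-classified":
            if open_in is None:
                # An exit without a preceding entry is unmatched.
                unmatched.append((ts, kind))
            else:
                intervals.append((open_in, ts))
                open_in = None
    return intervals, unmatched
```

Running this per employee gives both the classified-area intervals and the list of unmatched events.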

Timeliner interaction includes:

•  Hovering the mouse pointer over an entry shows its details at the top.

•  Clicking a particular network instance filters the data by its destination IP. When the ‘mark IP’ checkbox is checked, the matching entries are instead highlighted with white circles while the other entries remain visible.

•  Right-clicking displays a long vertical line marking a specific time.

 

Video:

 

VideoMC1.swf.

 

ANSWERS:


MC1.1: Identify which computer(s) the employee most likely used to send information to his contact in a tab-delimited table which contains for each computer identified: when the information was sent, how much information was sent and where that information was sent. 

Traffic.txt.

 


MC1.2:  Characterize the patterns of behavior of suspicious computer use.

Initially, we looked for suspicious computer usage among the network instances with large request sizes. We soon recognized a few cases among them where a computer was used while its owner was in the classified area, showing that somebody else used the computer. We call them DeceptiveUsages. We then found that they all have the destination IP ‘100.59.151.133’ and the port ‘8080’.

Next, we filtered the entire network data by this destination IP and found that all matching instances also shared the same port, ‘8080’, and had large request sizes. We call them SuspDstUsages, short for usages with a suspicious destination IP. Clearly, SuspDstUsages is a superset of DeceptiveUsages. We also observed varying source IPs, which led us to believe that the spy probably used other employees’ computers to hide his identity.
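The two sets can be expressed as simple filters. The sketch below is only illustrative: the `Flow` record and its field names are hypothetical (they are not the column names of the challenge data), and the classified-area intervals are assumed to have been computed beforehand for each employee.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Flow:
    """One network data instance; field names are hypothetical."""
    time: datetime
    employee: int       # owner of the source computer
    dst_ip: str
    port: int
    request_size: int


SUSPICIOUS_DST = "100.59.151.133"  # destination IP reported above

Interval = Tuple[datetime, datetime]


def susp_dst_usages(flows: List[Flow], dst_ip: str = SUSPICIOUS_DST) -> List[Flow]:
    """All flows sent to the suspicious destination IP."""
    return [f for f in flows if f.dst_ip == dst_ip]


def deceptive_usages(flows: List[Flow],
                     classified: Dict[int, List[Interval]]) -> List[Flow]:
    """Flows sent while the computer's owner was in the classified area."""
    return [
        f for f in flows
        if any(start <= f.time <= end
               for start, end in classified.get(f.employee, []))
    ]
```

Applying `deceptive_usages` to the output of `susp_dst_usages` yields a subset of it by construction, which mirrors the superset relation described above.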

Next we tried to pinpoint the spy. Assuming there is a single spy, who can be in only one place at a time, we excluded the computer owners in DeceptiveUsages from the suspect list. However, those not in DeceptiveUsages but in SuspDstUsages remain possible spy candidates. Our analysis indicated that most of the SuspDstUsages occurred on computers during their owners’ absence from the office. Naturally, the officemates sitting right next to them become alternative suspects, since they can easily access their officemates’ computers without being noticed. Investigating those suspects led us to Employee 30 as the most plausible spy. In what follows, we present the details of these analyses.

First, we looked into each employee’s data in SuspDstUsages. The Per-employee views, Figures 1-2, show the daily patterns of a particular employee. Figure 1 visualizes the data of employee 31, one of the most frequent source IPs in SuspDstUsages. In all the figures included, SuspDstUsages are highlighted with white circles based on destination IP filtering. In Figure 1, from the blue ‘x’ marks, we can see that he arrives at the office around 10~11am, returns from lunch around 4~5pm, and does not piggyback often. From the last activities of each day, we can infer when he left the office. Among the highlighted entries, the second one from the top is one of the DeceptiveUsages, since it overlaps with his presence in the classified area, i.e., the red horizontal line right over it. The other two look somewhat isolated from the nearby network usages on their respective days. Although he shows ‘prox-in-building’ 18 minutes and 39 minutes before the first and the third entries, respectively, if we ignore these two highlighted entries, there are noticeable gaps in the timeline between each ‘prox-in-building’ and the next network usage. This means he probably did not get back to his room for a while after entering the building, possibly because of a meeting or another event somewhere other than his office and the classified area, which would have given the spy time to use his computer during his absence. A similar explanation applies to Figure 2, the Per-employee view of employee 18. Here the highlighted entry looks even more isolated from the nearby network usages on that day: he shows no network usage after 3:43pm, and then the highlighted network usage suddenly occurs at 5:57pm. This means he probably left the office at 3:43pm, and the spy used his computer at 5:57pm.
Although not included in the figures, the Per-employee view of employee 13 shows another type of SuspDstUsages, which occurred about an hour before he entered the office in the morning. These analyses of every entry in SuspDstUsages show that none of the computer owners in SuspDstUsages could be the spy, which means the spy never used his own computer when sending large amounts of data.
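The notion of an "isolated" entry was judged visually in Timeliner, but it can be made concrete as the time between a highlighted entry and the nearest other network usage from the same computer on that day. The sketch below is an assumption about how one might quantify this; the function name is hypothetical.

```python
from datetime import datetime, timedelta
from typing import List, Optional


def isolation_gap(flagged_time: datetime,
                  other_times: List[datetime]) -> Optional[timedelta]:
    """Time between a flagged entry and the nearest other network usage.

    `other_times` holds the remaining usages of the same computer on the
    same day. Returns None when there is no other usage to compare with.
    """
    if not other_times:
        return None
    return min(abs(flagged_time - t) for t in other_times)
```

For employee 18 in Figure 2, the last regular usage at 3:43pm and the highlighted usage at 5:57pm give a gap of 2 hours 14 minutes.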

Next we focused on their officemates. Figures 3-5 show Per-day views of days 15, 17, and 29, respectively, where we can see other employees’ activities at a specific time. The horizontal timeline of employee 30, our final suspect, is drawn thicker. The long vertical lines, placed by right-clicking, mark a reference time of interest; since a screenshot can include only a single line, we placed it on the more important entry. In Figure 3, the first entry is one of the DeceptiveUsages, and, more importantly, the computer owner’s officemate, employee 17, shows no activity around that time either. Since the room was vacant, employee 30 could have entered it and used either computer. The computer in the other entry belongs to employee 30’s officemate. Employee 30’s network usage before and after this entry indicates that he was in the room at that time. He could easily have used his officemate’s computer whenever the officemate was absent, which would explain why that computer was used frequently in SuspDstUsages. In Figure 4, an interesting case is found where employee 30 shows a ‘prox-out-classified’ with no matching ‘prox-in-classified’, which is shown as a red ‘x’ mark. There were only three such cases in the entire data, all of them from employee 30. He may have wanted to hide how long he stayed in the classified area. As in Figure 3, the two highlighted entries in Figure 4 occurred when both employees of the room were absent. In Figure 5, around the time of the last two highlighted entries, employee 30 shows a gap of about 30 minutes in his network usage, which makes it plausible that he left his desk in search of a computer to use and came back afterwards.

While performing these analyses on SuspDstUsages, we found some cases that did not fit this explanation. For instance, employee 30 showed network usage just 1 second after a particular entry, and he could not have come back from another room within 1 second. Additionally, the computer owner came back 2~3 minutes after some entries in SuspDstUsages, as shown in the first entry of Figure 5. This could imply that the spy was alerted, which is why we think there may be accomplices, though our analyses failed to identify any.
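Such inconsistencies can be listed systematically by comparing each entry in SuspDstUsages with the suspect's own network usage. The following is a minimal sketch under stated assumptions: the function name is hypothetical, and `min_travel`, the shortest time in which the suspect could plausibly move between rooms, is an arbitrary illustrative threshold rather than a value taken from the data.

```python
from datetime import datetime, timedelta
from typing import List, Tuple


def implausible_pairs(flagged_times: List[datetime],
                      suspect_times: List[datetime],
                      min_travel: timedelta = timedelta(minutes=1)
                      ) -> List[Tuple[datetime, datetime]]:
    """Pairs (flagged entry, suspect's own usage) that are too close in time.

    A pair is reported when the suspect's own network usage falls within
    `min_travel` of a flagged entry on another computer.
    """
    return [
        (f, s)
        for f in flagged_times
        for s in suspect_times
        if abs(s - f) < min_travel
    ]
```

Every pair returned is a case like the 1-second one above, where the single-spy explanation needs either an accomplice or another interpretation.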

 

 

Figures (resized): 

Figure 1: Fig1_PerEm31.PNG

Figure 2: Fig2_PerEm18.PNG

Figure 3: Fig3_PerDay15.PNG

Figure 4: Fig4_PerDay17.PNG

Figure 5: Fig5_PerDay29.PNG

 

Links to Figures (with original sizes)

Figure 1: Fig1_PerEm31.PNG

Figure 2: Fig2_PerEm18.PNG

Figure 3: Fig3_PerDay15.PNG

Figure 4: Fig4_PerDay17.PNG

Figure 5: Fig5_PerDay29.PNG